Gemma3: after finetuning, model.safetensors.index.json differs from the original, so the model cannot be used for vLLM inference #8243
Comments
The cause is here: the original Gemma checkpoints all tie the embeddings, but HF makes a copy as lm_head and saves it. A workable fix:
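Not the fix referred to above, just a minimal sketch that checks the tied-embedding claim: it verifies that the extra lm_head.weight saved after SFT is a copy of the input embedding. Paths and tensor key names are assumptions.

```python
# Minimal sketch (not the fix referenced above). Paths and tensor key names are
# assumptions; it only checks that the extra lm_head.weight saved after SFT is a
# byte-for-byte copy of the tied input embedding.
import json
import torch
from safetensors import safe_open

ckpt_dir = "/path/to/gemma-3-12b-it_sft"  # placeholder

with open(f"{ckpt_dir}/model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

def load_tensor(name: str) -> torch.Tensor:
    # look up which shard holds the tensor, then read just that tensor
    with safe_open(f"{ckpt_dir}/{weight_map[name]}", framework="pt", device="cpu") as shard:
        return shard.get_tensor(name)

lm_head = load_tensor("lm_head.weight")
# the embedding key below is an assumed name; check weight_map for the real one
embed = load_tensor("model.language_model.embed_tokens.weight")
print(torch.equal(lm_head, embed))  # True if lm_head is just the tied copy
```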
Thank you very much for your reply. Is it possible to avoid saving lm_head during training? I am not very familiar with vLLM; if vLLM needs to be modified, is it this line that should change: https://github.com/vllm-project/vllm/blob/7782464a1714f6081ca06f47b75e824b14316c72/vllm/model_executor/models/utils.py#L274 , skipping the weight if its name is "lm_head"? |
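For reference, the skip idea asked about above would look roughly like the filter below. This is only an illustration, not vLLM's actual loader code; the key name is taken from the error message.

```python
# Illustration only -- not vLLM's real weight loader. It shows the "skip lm_head"
# idea: drop the duplicated tensor while iterating over checkpoint weights.
from typing import Iterable, Tuple
import torch

def drop_tied_lm_head(
    weights: Iterable[Tuple[str, torch.Tensor]]
) -> Iterable[Tuple[str, torch.Tensor]]:
    for name, tensor in weights:
        if name == "lm_head.weight":  # assumed key name of the duplicated tensor
            continue
        yield name, tensor
```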
Would simply deleting that key from model.safetensors.index.json have the same effect? |
I don't think so; loading should be driven by the keys that are actually present in the safetensors shards. |
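In other words, removing the entry only from the index would not be enough; the tensor would also have to be removed from the shard that stores it. A rough, untested sketch of that idea (the key name and checkpoint path are assumptions):

```python
# Untested sketch: remove the duplicated lm_head.weight from both the shard that
# stores it and model.safetensors.index.json, so the two stay consistent.
# The key name and checkpoint path are assumptions.
import json
import os
from safetensors.torch import load_file, save_file

ckpt_dir = "/path/to/gemma-3-12b-it_sft"  # placeholder
key = "lm_head.weight"                    # assumed name of the duplicated tensor

index_path = os.path.join(ckpt_dir, "model.safetensors.index.json")
with open(index_path) as f:
    index = json.load(f)

shard_name = index["weight_map"].pop(key)       # drop the index entry
shard_path = os.path.join(ckpt_dir, shard_name)

tensors = load_file(shard_path)                 # rewrite the shard without the tensor
removed = tensors.pop(key)
save_file(tensors, shard_path, metadata={"format": "pt"})

# keep the total size recorded in the index metadata in sync with the smaller shard
index["metadata"]["total_size"] -= removed.numel() * removed.element_size()
with open(index_path, "w") as f:
    json.dump(index, f, indent=2)
```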
Could you provide a conversion script? I am not very familiar with vLLM and worry about changing the wrong thing. Right now I am trying: model_name_or_path: /disk2/output/gemma-3-12b-it_sft with CUDA_VISIBLE_DEVICES=5,6 API_PORT=5002 llamafactory-cli api examples/inference/gemma3.yaml |
You can update transformers to 4.52.4; versions 4.52.1 through 4.52.3 all have some bugs. |
Do I need to retrain? |
Yes, retraining is required. |
Reminder
System Info
llamafactory version: 0.9.3.dev0
Platform: Linux-4.18.0-348.7.1.el8_5.x86_64-x86_64-with-glibc2.28
Python version: 3.11.0
PyTorch version: 2.6.0+cu124 (GPU)
Transformers version: 4.52.3
Datasets version: 3.6.0
Accelerate version: 1.7.0
PEFT version: 0.15.2
TRL version: 0.9.6
GPU type: NVIDIA A100-SXM4-80GB
GPU number: 8
GPU memory: 79.14GB
vLLM version: 0.8.5.post1
Git commit: 2c464f329dcd798a0b6b7aaed4719b67dec0c099
Default data directory: not detected
Reproduction
### model
model_name_or_path: /storage/home/westlakeLab/zhangjunlei/models/google/gemma-3-12b-it
### method
stage: sft
do_train: true
finetuning_type: full
freeze_vision_tower: true # choices: [true, false]
freeze_multi_modal_projector: true # choices: [true, false]
freeze_language_model: false # choices: [true, false]
deepspeed: examples/deepspeed/ds_z2_config.json
### dataset
dataset: phone_web_0131_fix_merge_1500_wait_scroll_fix_hover
template: gemma3
cutoff_len: 8192
max_samples: 1000000000
overwrite_cache: False
preprocessing_num_workers: 256
dataset_dir: /backup/lanzhenzhongLab/junleizhang/dataset
### output
output_dir: /backup/lanzhenzhongLab/junleizhang/output/gemma3_phone_web_0131_fix_merge_1500_wait_scroll_fix_hover
logging_steps: 10
save_strategy: epoch
plot_loss: true
overwrite_output_dir: true
save_total_limit: 1
### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 2
learning_rate: 2.0e-5
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.05
bf16: true
ddp_timeout: 180000000
image_max_pixels: 1048576
report_to: wandb
mix_strategy: concat
use_fast_tokenizer: true
disable_shuffling: true
After finetuning the loss is normal, but the saved model's model.safetensors.index.json differs from the original model's, which causes the error: there is no module or parameter named 'lm_head' in Gemma3ForConditionalGeneration
I checked and there is indeed an extra lm_head. The original model's model.safetensors.index.json:
After finetuning:
Others
No response